CLT for the proportion of infected individuals for an epidemic model on a complete graph

F. Machado, H. Mashurian and H. Matzinger 

Abstract 

We prove a Central Limit Theorem for the proportion of infected individuals for an epidemic model by dealing with a discrete time system of simple random walks on a complete graph with n vertices. Each random walk plays the role of a virus. Individuals are all connected as vertices in a complete graph. A virus duplicates each time it hits a susceptible individual, and dies as soon as it hits an already infected individual. The process stops as soon as there are no more viruses. This model is closely related to some epidemiological models like those for virus dissemination in a computer network.



1 Introduction

We prove a Central Limit Theorem for the proportion of infected individuals for an epidemic model. We consider a discrete time system of simple random walks on $K_n$, the complete graph on $n$ vertices, a graph with vertex set $V = \{1, 2, \dots, n\}$ and each pair of vertices linked by an edge.

This model, also known as the frog model, has been mostly considered on infinite graphs, in particular hypercubic lattices and homogeneous trees, for which results such as shape theorems and phase transitions have been proved. See for instance [2], [3], [I], [I], [E], [H], PH] and the references therein. A comprehensive introduction to random walks on finite and infinite graphs can be found in [T].

In this paper we deal with a discrete time process on $K_n$ evolving as follows. At time zero there is one inactive particle at each vertex of $K_n$. A particle is chosen to become active, and in its turn that active particle chooses a vertex to jump to, activating the particle sitting there. Since at each time just one active particle makes a displacement, one active particle is chosen uniformly to make its move. From that time on, each active particle performs a random walk on the vertices of $K_n$, activating all inactive particles it meets along its way. Each active particle lives as long as it chooses vertices with an inactive particle on them, dying the first time it chooses to jump to a vertex which has been visited before by some active particle. The process continues until there are no more active particles.
Considering

$V_t$ = the number of vertices visited by the process up to time $t$,

we denote by $V_\infty = \lim_{t \to \infty} V_t$ the number of vertices which have been visited by active particles when the process comes to an end. We investigate the asymptotic distribution of the random variable $V_\infty$. The main result of this paper (Theorem 1.1) shows that $V_\infty$, properly re-scaled, converges in distribution to a normal random variable.

Let us formally define the model, whose dynamics take place on $K_n$. First we define $A_t$, $D_t$ and $I_t$ as the number of active particles at time $t$, the number of vertices whose original particles have already died up to time $t$, and the number of particles still inactive at time $t$, respectively. In this sense, $V_t = A_t + D_t$ and $A_t + D_t + I_t = n$ for all discrete times $t$. Note that $\{(A_t, D_t, I_t)\}_{t \ge 0}$ is a Markov chain going from the state $(a, n-(i+a), i)$

\[
\begin{cases}
\text{to } (a+1,\ n-(i+a),\ i-1) & \text{w.p. } \dfrac{i}{n}, \\[4pt]
\text{or to } (a-1,\ n-(i+a)+1,\ i) & \text{w.p. } \dfrac{n-i}{n},
\end{cases}
\tag{1.1}
\]

for discrete values of $a \in \{1, 2, \dots, n\}$ and $i \in \{0, 1, 2, \dots, n-1\}$. The chain starts from $A_0 = 1$, $D_0 = 0$ and $I_0 = n-1$ and comes to an end as soon as, for some discrete time $t$, $A_t = 0$. Besides, let $\{S_t\}_{t > 0}$ denote a family of independent uniformly distributed random variables on $V$, the set of vertices of $K_n$. At each time $t$ one active particle (also uniformly chosen among the $A_{t-1}$ active particles) chooses the vertex $S_t$ to jump to. It meets and activates a still inactive particle if and only if $S_t \notin \{S_1, \dots, S_{t-1}\}$. In this case $A_t = A_{t-1} + 1$. Otherwise that active particle dies, and $A_t = A_{t-1} - 1$. Observe that $A_\infty := \lim_{t \to \infty} A_t = 0$. For simulations and mean field analysis see [3].
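
The transition rule (1.1) is simple enough to simulate directly. The following Python sketch is only an illustration and not the authors' code: the function name `simulate_chain` and the use of Python's `random` module are assumptions made here. It tracks $(A_t, D_t, I_t)$ step by step and returns the final number of visited vertices together with the number of jumps.

```python
import random

def simulate_chain(n, rng):
    """Run the chain (A_t, D_t, I_t) of (1.1) until A_t = 0.

    Returns (visited, steps): visited = A + D at the end,
    steps = total number of jumps performed.
    """
    active, dead, inactive = 1, 0, n - 1   # A_0 = 1, D_0 = 0, I_0 = n - 1
    steps = 0
    while active > 0:
        steps += 1
        # The jumper picks a uniform vertex; it is still inactive w.p. i/n.
        if rng.randrange(n) < inactive:
            active += 1        # the particle sitting there wakes up
            inactive -= 1
        else:
            active -= 1        # the jumper dies on a visited vertex
            dead += 1
        assert active + dead + inactive == n   # invariant A + D + I = n
    return active + dead, steps

rng = random.Random(0)
visited, steps = simulate_chain(1000, rng)
print(visited, steps)
```

Since every jump either wakes exactly one particle or kills exactly one, a run that ends with $v$ visited vertices always takes $2v - 1$ jumps, which gives a cheap sanity check on any implementation.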

Let $q$ be the only non-zero solution to the equation

\[ 2p = -\ln(1-p) \]

in $[0, 1[$. (See also lemma 2.2.) Let $\mu_r$ be equal to

\[ \mu_r := 2 - \frac{1}{1-q}. \]

Finally let $\sigma$ be equal to

\[
\sigma := \frac{\sqrt{\int_0^q \frac{x}{(1-x)^2}\,dx}}{|\mu_r|}
        = \frac{\sqrt{\frac{q}{1-q} - 2q}}{|\mu_r|}.
\]

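We are now ready to formulate the main theorem of this paper.

Theorem 1.1 We have that

\[ \frac{V_\infty - qn}{\sigma\sqrt{n}} \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0,1) \]

as $n$ goes to infinity, where $\xrightarrow{\ \mathcal{L}\ }$ means convergence in law.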
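
The statement can be explored empirically. The sketch below is a Monte Carlo illustration under stated assumptions, not part of the proof: it assumes that the quantity being standardized is the final number of visited vertices, it solves $2p + \ln(1-p) = 0$ by bisection, and it evaluates $\sigma$ through the closed form $q/(1-q) + \ln(1-q)$ of the integral. All function names are hypothetical.

```python
import math
import random

def solve_q(tol=1e-12):
    """Bisection for the non-zero root of 2p + ln(1 - p) on (1/2, 1)."""
    lo, hi = 0.5, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 2 * mid + math.log(1 - mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def final_visited(n, rng):
    """Simulate the process on K_n and return the number of visited vertices."""
    active, inactive = 1, n - 1
    while active > 0:
        if rng.randrange(n) < inactive:
            active += 1
            inactive -= 1
        else:
            active -= 1
    return n - inactive

q = solve_q()
mu_r = 2 - 1 / (1 - q)
# Closed form of the integral of x / (1 - x)^2 over [0, q].
sigma = math.sqrt(q / (1 - q) + math.log(1 - q)) / abs(mu_r)

def standardized_samples(n, runs, seed=0):
    """Samples of (V - q n) / (sigma sqrt(n)), V = final number of visited vertices."""
    rng = random.Random(seed)
    scale = sigma * math.sqrt(n)
    return [(final_visited(n, rng) - q * n) / scale for _ in range(runs)]

samples = standardized_samples(2000, 300)
inside = sum(1 for z in samples if abs(z) < 2) / len(samples)
print(sorted(samples)[len(samples) // 2], inside)
```

For a standard normal limit one expects the sample median to be near zero and roughly 95% of the samples to fall in $(-2, 2)$. With probability $1/n$ the very first jump kills the only active particle, so an occasional extreme negative sample is to be expected at finite $n$; robust summaries such as the median are therefore preferable to the sample variance here.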

This model can be viewed as an oriented, dependent, long range percolation model once one considers the analogous setup on an infinite connected graph. The main difficulty in answering the classical questions related to phase transition and shape theorem in this setup is that the classical coupling techniques cannot be applied; besides, both the FKG and BK inequalities fail. In [6] the authors construct a very interesting renewal structure leading to a definition of regeneration times for which tail estimates are obtained.

Another possible approach and source of interest is to see this model as an option for modelling the spread of a disease in a population or the spread of viruses in a computer network. Following the setup we use in this paper, the virus duplicates any time it infects a susceptible individual. Once that happens the individual becomes immune. The virus dies the first time it tries to infect an immune individual. The population here is considered finite and has full contact, as every individual can be contacted directly by any other individual. The main question we investigate in this paper is to determine the distribution of the percentage of the population which escapes the disease, remaining not infected (but still susceptible) after all the viruses are dead. For simulations and mean field analysis of this model see [5].

2 Main Ideas 

Let us define $T(s)$, the time it takes for the process to reach $s$ visited vertices. So, consistently with the process definition, $T(1) = 0$. For $s \in \{2, \dots, n\}$, let

\[ T(s) = \min\{t \in \mathbb{N} : V_t = s\} \]

and

\[ \rho = \min\{t : A_t = 0\}. \]

Observe that $A_\infty := \lim_{t \to \infty} A_t = A_\rho$. From (1.1), note also that when there are $s = n - i$ visited vertices, each active particle which jumps has a probability of $s/n$ to die and a probability of $(n-s)/n$ to hit an inactive particle.

For all $s$ such that the process has reached the level of $s$ visited vertices, we define $X_s$ as the time the process spends at that level. Equivalently, $X_s$ is the (random) number of active particles which have to jump so that the number of visited vertices either goes from $s$ to $s+1$ or the process finishes.

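For the number of visited vertices to go from $s$ to $s+1$, we need one additional unvisited vertex to be chosen. Hence,

\[
X_s =
\begin{cases}
1 & \text{w.p. } \dfrac{n-s}{n}, \\[4pt]
k & \text{w.p. } \left(\dfrac{s}{n}\right)^{k-1} \dfrac{n-s}{n}, \\[4pt]
\ \vdots \\[2pt]
A_{T(s)} - 1 & \text{w.p. } \left(\dfrac{s}{n}\right)^{A_{T(s)}-2} \dfrac{n-s}{n}, \\[4pt]
A_{T(s)} & \text{w.p. } \left(\dfrac{s}{n}\right)^{A_{T(s)}-1}.
\end{cases}
\]

In other words

\[ X_s \sim \min\left\{\mathcal{G}\!\left(\tfrac{n-s}{n}\right),\ A_{T(s)}\right\} \]

where $\mathcal{G}$ stands for the geometric probability distribution.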
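
As a quick consistency check (a worked computation added for illustration, writing $a = A_{T(s)}$ and $x = s/n < 1$ for brevity), the probabilities of a geometric variable with success probability $1 - x$ truncated at $a$ sum to one:

```latex
\[
\sum_{k=1}^{a-1} x^{k-1}(1-x) \;+\; x^{a-1}
  \;=\; (1-x)\,\frac{1-x^{a-1}}{1-x} \;+\; x^{a-1}
  \;=\; 1 .
\]
```

The last term $x^{a-1}$ collects all the mass of the untruncated geometric variable on $\{a, a+1, \dots\}$, which is exactly what taking the minimum with $A_{T(s)}$ does.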

Observe that for realizations of the process such that $X_s = A_{T(s)}$, either the process stops at time $T(s) + A_{T(s)}$ and $T(s+1) = \infty$, or $T(s+1) = T(s) + A_{T(s)}$.

Going from $s$ to $s+1$ visited vertices, the change in the number of active particles is designated by $Y_s$. For $s$ such that $T(s+1) < \infty$ we define

\[ Y_s := A_{T(s+1)} - A_{T(s)} = 2 - X_s. \]

So we have that

\[ A_{T(s)} = 1 + \sum_{i=1}^{s-1} Y_i. \]

Besides, $Y_{V_\infty} = -A_{T(V_\infty)}$. Note that the variables $Y_1, Y_2, \dots, Y_i$ are independent on the event $\{i \le V_\infty - 1\}$. They are not identically distributed.

We now build an approximation of the model by considering, for $s = 1, 2, \dots$, a sequence of independent $X_s \sim \mathcal{G}\!\left(\tfrac{n-s}{n}\right)$ and $Y_s = 2 - X_s$.

Moreover we consider

\[ W_s := \sum_{i=1}^{s} Y_i, \]

\[ \tau := \min\{s : W_s < 1\}. \]

Observe that on the event $\{i \le V_\infty - 1\}$ it is possible to make a coupling such that the variable $X_i$ of the original process and that of the approximation coincide a.s., and from this we have that $\rho = \tau - 1$. So, for what comes next we are interested in the random variable $\tau$. We show that $\tau$ has expectation of order $n$, standard deviation of order $\sqrt{n}$ and, when re-scaled properly, converges to a normal variable.

Let $\mu_s := E[Y_s]$, so that

\[ \mu_s = 2 - \frac{1}{1 - s/n}. \]

Note that for $s < n/2$ we have $\mu_s > 0$. On the other hand for $s > n/2$ we find $\mu_s < 0$. This means that up to about $s = n/2$ the random map $s \mapsto W_s$ increases and after $s = n/2$ it decreases.

Let $w_s := E[W_s]$ and let $W_s^* := W_s - w_s$, whilst $Y_i^* := Y_i - E[Y_i]$. By these definitions, we get

\[ W_s^* = Y_1^* + Y_2^* + \dots + Y_s^*. \]

The variables $Y_1^*, Y_2^*, \dots$ are independent. Let $c < 1$ be any constant not depending on $n$. Then for $s \le cn$ the variables $Y_i^*$ with $i \le s$ are stochastically uniformly bounded by a geometric variable. Hence, $W_s^*$ is typically of order $\sqrt{s}$ when $s \le cn$. On the other hand, $s \mapsto w_s$ takes on values which are of order $n$. Hence, "the main shape" of $s \mapsto W_s$ is "determined" by $s \mapsto w_s$, whilst $W_s^*$ only represents a smaller fluctuation.
We have for $s < n$,

\[ w_s = \sum_{i=1}^{s} \left(2 - \frac{1}{1 - i/n}\right), \]

which implies that

\[ w_s \approx n \int_0^{s/n} \left(2 - \frac{1}{1-x}\right) dx. \]

The integral in the expression on the right side of the above approximation is equal to

\[ 2s/n + \ln(1 - s/n). \]

The next lemma gives the precision of our approximation for $w_s$.
Lemma 2.1 For all $n$ and all $s < n$, we have:

\[ \left| w_s - n\left[2(s/n) + \ln(1 - (s/n))\right] \right| \le 3 + \frac{1}{1 - (s/n)}. \tag{2.3} \]

Proof. Let $f$ denote a decreasing function on the interval $[a, b]$. Note that we have

\[
\frac{b-a}{n}\sum_{i=1}^{n} f\!\left(a + (b-a)\frac{i}{n}\right)
\;\le\; \int_a^b f(y)\,dy
\;\le\; \frac{b-a}{n}\sum_{i=0}^{n-1} f\!\left(a + (b-a)\frac{i}{n}\right)
\]

and hence

\[
\left| \frac{b-a}{n}\sum_{i=1}^{n} f\!\left(a + (b-a)\frac{i}{n}\right) - \int_a^b f(y)\,dy \right|
\;\le\; \frac{b-a}{n}\left(|f(a)| + |f(b)|\right).
\tag{2.4}
\]

The last inequality also holds for increasing functions. Note that the map $x \mapsto 2 - \frac{1}{1-x}$ is everywhere monotone on $[0, 1[$. Hence we can apply to it inequality (2.4) and find

\[
\left| w_s - n\int_0^{s/n} \left(2 - \frac{1}{1-y}\right) dy \right| \le 3 + \frac{1}{1 - (s/n)}.
\tag{2.5}
\]

The integral in the expression above can be calculated explicitly:

\[ \int_0^{s/n} \left(2 - \frac{1}{1-y}\right) dy = 2(s/n) + \ln(1 - (s/n)). \]

Plugging the expression into inequality (2.5) yields the desired result. ■
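
The bound (2.3) is easy to check numerically for moderate $n$. The sketch below is an illustration only; it assumes that $w_s$ is the partial sum $\sum_{i \le s} (2 - 1/(1 - i/n))$ of the expectations of the $Y_i$, and the function name `min_slack` is hypothetical.

```python
import math

def min_slack(n):
    """Smallest value of  bound - |w_s - n*h(s/n)|  over 1 <= s < n,
    where h(x) = 2x + ln(1 - x) and bound = 3 + 1/(1 - s/n)."""
    w = 0.0
    worst = float("inf")
    for s in range(1, n):
        x = s / n
        w += 2 - 1 / (1 - x)                      # w_s as a running sum
        approx = n * (2 * x + math.log(1 - x))    # n * h(s/n)
        bound = 3 + 1 / (1 - x)
        worst = min(worst, bound - abs(w - approx))
    return worst

for n in (10, 100, 1000):
    print(n, min_slack(n))
```

A non-negative return value means that (2.3) holds for every $s < n$ at that value of $n$.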

We will see that we only need to consider values of $s$ for which $s \le cn$, where $c < 1$ is a constant not depending on $n$. Hence the bound on the right side of (2.3) can be treated as a constant bound.

The main result in this paper is concerned with finding the (random) zero $\tau$ of the map $s \mapsto W_s$. In the next lemma, we start by investigating the zeros of the map $p \mapsto 2p + \ln(1-p)$, which is our first approximation of $W_s$.

Lemma 2.2 The map

\[ p \mapsto 2p + \ln(1-p), \qquad [0, 1[ \,\to \mathbb{R} \]

has only one zero $q \in\, ]0, 1[$. Furthermore

\[ 0.796 < q < 0.798. \tag{2.6} \]

Proof. The derivative of our map is $2 - 1/(1-p)$. It is strictly positive for $p \in [0, 1/2[$. So our map $h(p) := 2p + \ln(1-p)$ first increases from the value $h(0) = 0$ to the positive value $h(1/2) = 1 - \ln 2 > 0$. After that, the derivative of $h(p)$ is strictly negative. Since $h(1/2) > 0$ and $h(p) \to -\infty$ as $p \to 1$, we infer that there is only one zero of the map $h(p)$ in $]0, 1[$. The bounds (2.6) were obtained by numeric approximation from above and below. ■
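
The bracket (2.6) can be confirmed by evaluating $h$ at the two endpoints: since $h$ is strictly decreasing on $]1/2, 1[$, a sign change between $0.796$ and $0.798$ pins the zero inside that interval. A minimal sketch (the helper name `h` mirrors the notation of the proof):

```python
import math

def h(p):
    """h(p) = 2p + ln(1 - p), defined for 0 <= p < 1."""
    return 2 * p + math.log(1 - p)

# A positive value at 0.796 and a negative value at 0.798 bracket the zero q.
print(h(0.796), h(0.798))
```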

Let $r$ be equal to

\[ r := nq. \]

Due to lemma 2.1, $w_r$ is close to zero up to a constant. In other words, $r$ is approximately equal to the zero of the map $s \mapsto w_s$. By definition, $W_s$ is equal to $w_s + W_s^*$, where typically $w_s$ takes on values of order $n$ and $W_s^*$ takes on values of order $\sqrt{n}$. This implies that the zero of $s \mapsto W_s$ is equal to the zero of $w_s$ plus/minus a term of order $\sqrt{n}$. Hence, the stopping time $\tau$ is typically equal to $r = nq$ plus a random term with standard deviation of order $\sqrt{n}$.

How big is the standard deviation of $\tau$? For this, let us quickly look at another, related problem: assume that the variables $Y_1, Y_2, \dots$ are i.i.d. variables with finite second moment and $E[Y_1] < 0$. Let $K > 0$ be a large number, and let $\tilde{\tau}$ be

\[ \tilde{\tau} := \min\{s \mid K + Y_1 + Y_2 + \dots + Y_s \le 0\}. \]

We find that $\tilde{\tau}$ typically takes values which are about equal to $K/|E[Y_1]|$ with a fluctuation of order $\sqrt{K}$. (The proof is identical to the proof of the Law of Large Numbers and Central Limit Theorem for Renewal Processes.) In our case, the variables $Y_i$ are not i.i.d. but only independent. However, the variables $Y_s$ with $s$ close to $r = nq$ all have about the same distribution, that is, a shifted geometric distribution with expectation $\mu_r$. Note that

\[ \mu_r := 2 - \frac{1}{1-q} \]

is a number not depending on $n$.
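
The i.i.d. heuristic above can be illustrated by simulation. The sketch below makes several assumptions for the sake of a concrete example: the increments are taken to be $Y_i = 2 - G_i$ with $G_i$ geometric of parameter $p = 0.2$ (so $E[Y_1] = -3$), the geometric variables are sampled by inversion, and the names `geometric` and `hitting_index` are hypothetical.

```python
import math
import random

def geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    u = 1.0 - rng.random()                 # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log(1.0 - p))

def hitting_index(K, p, rng):
    """min{s : K + Y_1 + ... + Y_s <= 0} with Y_i = 2 - Geometric(p)."""
    total, s = K, 0
    while total > 0:
        total += 2 - geometric(p, rng)
        s += 1
    return s

rng = random.Random(0)
K, p = 3000, 0.2
runs = [hitting_index(K, p, rng) for _ in range(200)]
mean = sum(runs) / len(runs)
spread = math.sqrt(sum((x - mean) ** 2 for x in runs) / (len(runs) - 1))
print(mean, K / abs(2 - 1 / p), spread, math.sqrt(K))
```

The sample mean should be close to $K/|E[Y_1]| = 1000$, while the sample standard deviation is of the same order as $\sqrt{K}$, in line with the renewal-type heuristic.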

We saw that, up to a constant, $w_r$ is approximately equal to zero and hence we have $W_r^* \approx W_r$. Assume that $W_r^* > 0$. Then, for $W_s$ to become zero after $s = r$, we need about

\[ \frac{W_r}{|\mu_r|} \approx \frac{W_r^*}{|\mu_r|} \]

additional "steps", i.e. additional variables $Y_s$. (The argument goes like the argument presented above for the i.i.d. variables $Y_i$.) This yields the approximation

\[ \tau \approx r - \frac{W_r^*}{\mu_r} = qn - \frac{Y_1^* + \dots + Y_{qn}^*}{\mu_r}. \tag{2.7} \]

The above approximation is typically precise up to a term of order $n^{1/4}$. This will be proven by introducing some events $B_0^n$, $B_1^n$ and $B_2^n$ and showing that when they hold (lemma 3.1) the error in the last approximation above is of order $n^{1/4}$. In the last section, we prove that the events $B_0^n$, $B_1^n$ and $B_2^n$ have their probabilities going to one when $n$ goes to infinity. The expression on the right side of approximation (2.7) gives the asymptotic behavior of the standard deviation of $\tau$. The reason is that the term

\[ \frac{Y_1^* + \dots + Y_{qn}^*}{\mu_r} \tag{2.8} \]

has a standard deviation of order $\sqrt{n}$, whilst the error term of the approximation (2.7) is of order $n^{1/4}$. Hence, the standard deviation of (2.8) is asymptotically equal to the standard deviation of $\tau$ up to a much smaller error term. Let us calculate the variance of the expression (2.8). The variables $Y_i^*$ are re-centered geometric variables with parameter $(n-i)/n$. The variance of $Y_i^*$ is thus

\[ \frac{i/n}{(1 - i/n)^2}. \]

Hence we find that the variance of the sum (2.8) is equal to

\[ \frac{1}{\mu_r^2}\sum_{i=1}^{qn} \mathrm{VAR}[Y_i^*] = \frac{1}{\mu_r^2}\sum_{i=1}^{qn} \frac{i/n}{(1 - i/n)^2}. \tag{2.9} \]

The sum in the above expression can be approximated by an integral. This is the content of the next lemma:

Lemma 2.3 We have for all $n$ and all $q < 1$ that

\[
\left| \sum_{i=1}^{qn} \frac{i/n}{(1 - i/n)^2} - n\int_0^q \frac{x}{(1-x)^2}\,dx \right| \le \frac{q}{(1-q)^2}.
\tag{2.10}
\]

Proof. Let $h(x) := x/(1-x)^2$. We find that the derivative is equal to

\[ h'(x) = \frac{1+x}{(1-x)^3}, \]

which is positive for all $x \in [0, 1[$. Hence, inequality (2.4) can be applied and we find that inequality (2.10) holds. ■
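
The quality of the integral approximation can be checked numerically. In the sketch below, the right-hand side $q/(1-q)^2$ is an assumption: it is the value $|h(0)| + |h(q)|$ that inequality (2.4) produces for $h(x) = x/(1-x)^2$. The integral is evaluated through its antiderivative $1/(1-x) + \ln(1-x)$, and the function names are hypothetical.

```python
import math

def variance_sum(n, q):
    """Sum of (i/n) / (1 - i/n)^2 for i = 1, ..., floor(q n)."""
    m = int(q * n)
    return sum((i / n) / (1 - i / n) ** 2 for i in range(1, m + 1))

def integral(q):
    """Integral of x / (1 - x)^2 over [0, q], in closed form."""
    return q / (1 - q) + math.log(1 - q)

def gap(n, q):
    """Absolute difference between the sum and n times the integral."""
    return abs(variance_sum(n, q) - n * integral(q))

for n in (50, 1000, 20000):
    print(n, gap(n, 0.7968), 0.7968 / (1 - 0.7968) ** 2)
```

The gap stays bounded as $n$ grows while both the sum and $n$ times the integral are of order $n$, which is all that is needed for the standard deviation computation that follows.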

The last lemma above implies that the standard deviation of (2.8) is approximately equal to $\sigma\sqrt{n}$, where

\[ \sigma = \frac{\sqrt{\int_0^q \frac{x}{(1-x)^2}\,dx}}{|\mu_r|}. \]

Note that this is exactly the re-scaling factor used in our main theorem 1.1.
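
The integral appearing in $\sigma$ has a closed form. The following worked computation (added here for convenience) uses the antiderivative $\frac{1}{1-x} + \ln(1-x)$ of $\frac{x}{(1-x)^2}$ and the defining relation $\ln(1-q) = -2q$:

```latex
\[
\int_0^q \frac{x}{(1-x)^2}\,dx
  = \left[\frac{1}{1-x} + \ln(1-x)\right]_0^q
  = \frac{q}{1-q} + \ln(1-q)
  = \frac{q}{1-q} - 2q .
\]
```

With $q \approx 0.797$ from lemma 2.2 this gives $\mu_r \approx -2.9$ and, as a rough numerical value, $\sigma \approx 0.52$.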



3 Combinatorics 

The first event $B_0^n$ is the event that the first $n^{1/4}$ active random walks which jump do not get killed:

\[ B_0^n := \{\forall i, j \le n^{1/4} \text{ with } i \ne j \text{ we have } S_i \ne S_j\}. \]

The next event $B_1^n$ is the event that the error in the approximation of $W_s$ by $w_s$ does not exceed the size $\ln s \sqrt{s}$. More precisely, $B_1^n$ is the event that for all $s$ with

\[ n^{1/4} \le s \le qn, \]

we have

\[ |W_s^*| \le \ln s \sqrt{s}. \]

The next event $B_2^n$ says that for all $i$ such that $0 < i \le (\ln n)^2\sqrt{n}$ we have that

\[ |Y_{qn+1} + Y_{qn+2} + \dots + Y_{qn+i} - i\mu_r| \le (\ln n)^3 \cdot n^{1/4} \]

and

\[ |Y_{qn-1} + Y_{qn-2} + \dots + Y_{qn-i} - i\mu_r| \le (\ln n)^3 \cdot n^{1/4}. \]

Next comes our main combinatorial lemma.

Lemma 3.1 Assume that $B_0^n$, $B_1^n$ and $B_2^n$ all hold. Then we have

\[ \left| \tau - \left( qn - \frac{Y_1^* + Y_2^* + \dots + Y_{qn}^*}{\mu_r} \right) \right| \le 2(\ln n)^3 \cdot n^{1/4}. \]

Proof. First, note that when $B_0^n$ holds, then $\tau$ is not in the interval $[0, n^{1/4}]$. Second, according to lemma 3.2 we have for all $s$ contained in the interval

\[ [n^{1/4},\ nq - (\ln n)^2\sqrt{n}] \tag{3.1} \]

that $w_s \ge \ln s \sqrt{s}$. Hence, when $B_1^n$ holds, and since by definition $W_s = W_s^* + w_s$, we get that $\tau$ is not in the interval (3.1).

Let $s \mapsto f(s)$ be the (random) linear map

\[ f(s) = W_{qn} + (s - qn)\mu_r. \]

Note that the map $f$ has a zero at

\[ qn - \frac{W_{qn}}{\mu_r}. \]

(Note that $\mu_r$ is negative.) Let $f^+$, resp. $f^-$, be the linear map $f + (\ln n)^3 \cdot n^{1/4}$, resp. $f - (\ln n)^3 \cdot n^{1/4}$. The zero of $f^+$, resp. $f^-$, is at

\[ qn - \frac{W_{qn} + (\ln n)^3 \cdot n^{1/4}}{\mu_r}, \]

resp.

\[ qn - \frac{W_{qn} - (\ln n)^3 \cdot n^{1/4}}{\mu_r}. \]

Let $I$ be the interval

\[ I := [qn - (\ln n)^2\sqrt{n},\ qn + (\ln n)^2\sqrt{n}] \]

and let $J$ be the interval

\[ J := \left[ qn - \frac{W_{qn} - (\ln n)^3 \cdot n^{1/4}}{\mu_r},\ qn - \frac{W_{qn} + (\ln n)^3 \cdot n^{1/4}}{\mu_r} \right]. \]

Note that when the event $B_1^n$ holds, then

\[ |W_{qn}^*| \le \ln n \sqrt{n}. \tag{3.2} \]

By definition

\[ W_{qn} = W_{qn}^* + w_{qn}. \tag{3.3} \]

But by equality (2.2) and by lemma 2.1 we have, for the constant $k := 3 + 1/(1-q)$,

\[ \left| w_{qn} - n\int_0^q \left(2 - \frac{1}{1-x}\right) dx \right| \le k. \tag{3.4} \]

By definition of $q$, we have

\[ \int_0^q \left(2 - \frac{1}{1-x}\right) dx = 0, \]

so that with inequality (3.4), we obtain

\[ |w_{qn}| \le k. \tag{3.5} \]

The last inequality together with (3.2) and (3.3) implies

\[ |W_{qn}| \le k + \ln n \sqrt{n}. \tag{3.6} \]
Using inequality (3.6), we obtain that for $n$ large enough

\[ J \subset I. \]

Now, when the event $B_2^n$ holds, then in the interval $I$ we have that $W_s$ is between $f^-$ and $f^+$, that is $f^-(s) \le W_s \le f^+(s)$ for $s \in I$. Hence, in the interval $I$, the map $s \mapsto W_s$ has its zero between the zeros of $f^-$ and $f^+$. More precisely, this means that $W_s$ has a zero somewhere in the interval $I$ and furthermore that all zeros of $W_s$ in the interval $I$ are located in $J$.

We can now summarize what we found so far: when $B_0^n$, $B_1^n$ and $B_2^n$ all hold, then the map $s \mapsto W_s$ has no zero before the interval $I$, but within $I$ all the zeros are located in the subinterval $J$. Hence, $\tau \in J$, which implies

\[ \left| \tau - \left( qn - \frac{W_{qn}}{\mu_r} \right) \right| \le -(\ln n)^3 \cdot n^{1/4} / \mu_r. \]

Using the last inequality together with (3.5) and (3.3), we find

\[ \left| \tau - \left( qn - \frac{W_{qn}^*}{\mu_r} \right) \right| \le -(\ln n)^3 \cdot n^{1/4} / \mu_r + k. \]

Note that for $n$ large enough, the right side of the last inequality is smaller than $2(\ln n)^3 \cdot n^{1/4}$. This finishes the proof of our lemma. ■

Lemma 3.2 For all $n$ large enough, every $s$ contained in the interval

\[ [n^{1/4},\ nq - (\ln n)^2\sqrt{n}] \tag{3.7} \]

satisfies

\[ w_s \ge \ln s \sqrt{s}. \tag{3.8} \]

Proof. We consider the three intervals $I_1 := [n^{1/4}, n/3]$, $I_2 := [n/3, n/2]$ and $I_3 := [n/2, nq - (\ln n)^2\sqrt{n}]$. We are going to prove that inequality (3.8) holds for each one of them. Let $h$ designate the map $h(x) := 2x + \ln(1-x)$. Note that the second derivative of $h$ is negative everywhere, in particular at $x = s/n$ for $s \in I_1$. Hence, $h'(x) \ge h'(1/3) > 0$ for all $x \in [0, 1/3]$. Since $h(0) = 0$, the mean value theorem implies that for all $x \in [0, 1/3]$, we have $h(x) \ge x \cdot h'(1/3)$. When $s \in I_1$ then $(s/n) \in [0, 1/3]$ so that

\[ h(s/n) \ge (s/n) \cdot h'(1/3). \tag{3.9} \]

According to inequality (2.3), we have

\[ w_s \ge n h(s/n) - (3 + 1/(1 - (s/n))) \]

and for $s \in I_1$, since $(s/n) \le 1/3$, we obtain

\[ w_s \ge n h(s/n) - 4.5. \]

The last inequality above together with inequality (3.9) then implies

\[ w_s \ge s \cdot h'(1/3) - 4.5. \tag{3.10} \]

The expression on the right side of the last inequality above is larger than $\ln s \sqrt{s}$ for $s$ large enough. However for $s \in I_1$, we have $s \ge n^{1/4}$, so that for $n$ large enough, $s$ will be large enough and

\[ s \cdot h'(1/3) - 4.5 \ge \ln s \sqrt{s}. \]

From the last inequality above and (3.10), inequality (3.8) follows.

Next we need to prove (3.8) for $s$ in $I_2$. Using inequality (2.3) together with the fact that $s/n \le 1/2$ for $s \in I_2$, we find

\[ w_s \ge n h(s/n) - 5. \tag{3.11} \]

When $s \in I_2$ we have that $s/n \in [1/3, 1/2]$. But on the interval $[1/3, 1/2]$ the map $h$ is everywhere increasing. Hence for $s \in I_2$, we have that $h(s/n) \ge h(1/3)$. Plugging the last inequality into (3.11) gives

\[ w_s \ge n h(1/3) - 5. \tag{3.12} \]

For $s \in I_2$ we have $s \le n$. Hence

\[ \ln s \sqrt{s} \le \ln n \sqrt{n}. \tag{3.13} \]

For $n$ large enough, $\ln n \sqrt{n}$ is less than $n h(1/3) - 5$. From this and inequalities (3.12) and (3.13), inequality (3.8) follows.

Now, it only remains to prove inequality (3.8) for $s \in I_3$. When $s \in I_3$ we have that $s/n \le q < 1$. This together with inequality (2.3) yields

\[ w_s \ge n h(s/n) - (3 + 1/(1-q)). \tag{3.14} \]

When $s \in I_3$, we have that

\[ s/n \in \left[ 0.5,\ q - \frac{(\ln n)^2}{\sqrt{n}} \right]. \]

On the interval on the right side of the last inclusion above the map $h$ is everywhere decreasing. Hence, for $s \in I_3$ we have

\[ h(s/n) \ge h\!\left( q - \frac{(\ln n)^2}{\sqrt{n}} \right). \tag{3.15} \]

Note that by definition $h(q) = 0$. Furthermore, $h'(q) < 0$. Hence, using the mean value theorem applied to (3.15), we obtain that for all $n$ large enough

\[ h(s/n) \ge -h'(q)\,\frac{(\ln n)^2}{2\sqrt{n}}. \tag{3.16} \]

The last inequality together with (3.14) gives

\[ w_s \ge -h'(q)\,\frac{\sqrt{n}}{2}(\ln n)^2 - (3 + 1/(1-q)). \tag{3.17} \]

For $n$ large enough, the right side of the last inequality above is larger than $\ln n \sqrt{n}$, which is larger than $\ln s \sqrt{s}$ when $s \in I_3$. Hence inequality (3.8) holds.
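
The phrase "for all $n$ large enough" can be made concrete with a direct check. The sketch below is an illustration under assumptions: $w_s$ is taken to be the partial sum $\sum_{i \le s}(2 - 1/(1 - i/n))$, the value $q = 0.7968$ is an approximation consistent with lemma 2.2, and the function name `check_lemma_3_2` is hypothetical.

```python
import math

def check_lemma_3_2(n, q=0.7968):
    """Return True if w_s >= ln(s) * sqrt(s) for every integer s in
    [n**0.25, n*q - (ln n)**2 * sqrt(n)]; an empty interval gives False."""
    lo = math.ceil(n ** 0.25)
    hi = math.floor(n * q - math.log(n) ** 2 * math.sqrt(n))
    w = 0.0
    for s in range(1, hi + 1):
        w += 2 - 1 / (1 - s / n)          # running value of w_s
        if s >= lo and w < math.log(s) * math.sqrt(s):
            return False
    return hi >= lo

print(check_lemma_3_2(10**4), check_lemma_3_2(10**6))
```

For $n = 10^4$ the quantity $(\ln n)^2\sqrt{n}$ already exceeds $nq$, so the interval (3.7) is empty and nothing can be checked; at $n = 10^6$ the interval is non-trivial and the inequality can be tested for every $s$ in it.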



4 Probabilities 

Lemma 4.1 We have that $P(B_0^n) \to 1$ as $n \to \infty$.

Proof. Let $B_{0i}$ be the event that the $i$-th frog jumping does not die. Hence, $B_{0i}$ is the event that $S_i \ne S_t$ for all $t < i$. For an event $A^n$ we designate by $A^{nc}$ its complement. We have that

\[ B_0^n = \bigcap_{i=1}^{n^{1/4}} B_{0i} \]

and hence

\[ P(B_0^{nc}) \le \sum_{i=1}^{n^{1/4}} P(B_{0i}^{c}). \tag{4.1} \]

Now, for $i \le n^{1/4}$ there are no more than $n^{1/4}$ visited vertices and hence the probability for the $i$-th jumping frog to die is not more than $n^{1/4}/n = n^{-3/4}$. This immediately implies that

\[ P(B_{0i}^{c}) \le n^{-3/4}. \]

Using the last inequality with inequality (4.1), we find

\[ P(B_0^{nc}) \le n^{1/4} \cdot n^{-3/4} = \frac{1}{\sqrt{n}}. \]

This finishes the proof of our lemma. ■
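
The event $B_0^n$ is a birthday-type event, so its probability can also be computed exactly and compared with the union bound above. The sketch below treats $B_0^n$ as the event that the first $\lfloor n^{1/4} \rfloor$ uniform draws from $n$ vertices are pairwise distinct; the function name is hypothetical.

```python
import math

def prob_all_distinct(n):
    """Exact probability that the first floor(n**0.25) uniform draws
    from n vertices are pairwise distinct."""
    m = math.floor(n ** 0.25)
    p = 1.0
    for j in range(m):
        p *= 1 - j / n        # the (j+1)-th draw must avoid j earlier values
    return p

for n in (16, 10**4, 10**8):
    print(n, prob_all_distinct(n), 1 - 1 / math.sqrt(n))
```

The exact probability is always at least $1 - n^{-1/2}$, in agreement with the bound obtained in the proof.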
Lemma 4.2 There exist two constants $\kappa > 0$ and $c > 0$ such that for every geometric variable $X$ with parameter $p$ satisfying

\[ p \in [1-q, 1] \tag{4.2} \]

and every $\Delta \in [0, c]$, we have

\[ E\!\left[e^{(X - (1/p) - \Delta)\cdot\kappa\Delta}\right] \le e^{-0.5\Delta^2\kappa}. \tag{4.3} \]

Proof. Let $\kappa$ be equal to

\[ \kappa := \min \frac{p^3}{1.1(1-p)(0.1+p)}, \tag{4.4} \]

where the minimum is taken over all $p \in [1-q, 1]$. Note that $p$ is bounded away from zero and $\kappa > 0$.

Let $c_1 > 0$ be a number such that for all $\Delta \in [0, c_1]$ we have

\[ \ln\!\left(1 - \frac{\Delta^2\kappa^2(1-p)(0.1+p)}{2p^3}\right) \ge -1.1\,\frac{\Delta^2\kappa^2(1-p)(0.1+p)}{2p^3}. \tag{4.5} \]

Such a number $c_1 > 0$ exists since for all $s > 0$ small enough we have $\ln(1-s) \ge -1.1s$ and since $\kappa^2(1-p)(0.1+p)/(2p^3)$ admits a uniform finite upper bound for $p \in [1-q, 1]$.

Let $c_2 > 0$ be a number such that for all $\Delta \in [0, c_2]$ we have

\[ e^{\Delta\kappa/p} \le 1 + \Delta\kappa/p + 1.1\Delta^2\kappa^2/(2p^2). \tag{4.6} \]

Such a number $c_2 > 0$ exists since for all $s > 0$ small enough we have $e^s \le 1 + s + 1.1s^2/2$. Let $c = \min\{c_1, c_2\}$. Hence when $\Delta \in [0, c]$ we have that both conditions (4.5) and (4.6) are satisfied.
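Now for the geometric variable $X$ with parameter $p$ we have that

\[ E\!\left[e^{(X - (1/p) - \Delta)t}\right] = \sum_{m=1}^{\infty} e^{mt - t/p - \Delta t}(1-p)^{m-1}p. \]

Using the formula $\sum_{m=1}^{\infty} a^m = a/(1-a)$, we find that for $t$ small enough

\[ E\!\left[e^{(X - (1/p) - \Delta)t}\right] = e^{-t/p - t\Delta}\,\frac{p e^{t}}{1 - (1-p)e^{t}} = \frac{p e^{t(1 - (1/p) - \Delta)}}{1 - e^{t}(1-p)}. \]

For $t = \kappa\Delta$, we obtain

\[ E\!\left[e^{(X - (1/p) - \Delta)\Delta\kappa}\right] = \frac{p e^{\Delta\kappa(1 - (1/p) - \Delta)}}{1 - e^{\Delta\kappa}(1-p)} = \frac{p e^{-\Delta^2\kappa}}{e^{\Delta\kappa(1-p)/p} - (1-p)e^{\Delta\kappa/p}}. \tag{4.7} \]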
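Note that for any $s > 0$ we have $e^s \ge 1 + s + s^2/2$. Hence for $s = \Delta\kappa(1-p)/p$ we find

\[ e^{\Delta\kappa(1-p)/p} \ge 1 + \Delta\kappa(1-p)/p + \Delta^2\kappa^2(1-p)^2/(2p^2). \tag{4.8} \]

Applying inequalities (4.8) and (4.6) to the expression on the right side of equality (4.7), we find

\[
\begin{aligned}
E\!\left[e^{(X - (1/p) - \Delta)\Delta\kappa}\right]
&\le \frac{p e^{-\Delta^2\kappa}}{1 + \frac{\Delta\kappa(1-p)}{p} + \frac{\Delta^2\kappa^2(1-p)^2}{2p^2} - (1-p)\left(1 + \frac{\Delta\kappa}{p} + \frac{1.1\Delta^2\kappa^2}{2p^2}\right)} \\
&= \frac{p e^{-\Delta^2\kappa}}{p + \frac{\Delta^2\kappa^2(1-p)^2}{2p^2} - \frac{1.1(1-p)\Delta^2\kappa^2}{2p^2}} \\
&= \frac{e^{-\Delta^2\kappa}}{1 + \frac{\Delta^2\kappa^2(1-p)^2}{2p^3} - \frac{1.1(1-p)\Delta^2\kappa^2}{2p^3}} \\
&= \exp(-\Delta^2\kappa)\cdot\exp\!\left(-\ln\!\left(1 - \Delta^2\kappa^2(1-p)(0.1+p)/(2p^3)\right)\right).
\end{aligned}
\]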

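Applying inequality (4.5) to the rightmost expression in the last chain of inequalities above, we find

\[
\begin{aligned}
E\!\left[e^{(X - (1/p) - \Delta)\Delta\kappa}\right]
&\le e^{-\Delta^2\kappa}\cdot e^{1.1\Delta^2\kappa^2(1-p)(0.1+p)/(2p^3)} && (4.9) \\
&= e^{-\Delta^2\kappa\left(1 - 1.1\kappa(1-p)(0.1+p)/(2p^3)\right)}. && (4.10)
\end{aligned}
\]

By the definition (4.4) of $\kappa$, we have

\[ \kappa \le \frac{p^3}{1.1(1-p)(0.1+p)} \]

and hence

\[ \frac{1.1\kappa(1-p)(0.1+p)}{2p^3} \le 0.5. \]

The last inequality above applied to (4.10) yields

\[ E\!\left[e^{(X - (1/p) - \Delta)\Delta\kappa}\right] \le e^{-0.5\Delta^2\kappa}. \]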
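
The closed form of the moment generating function used at the start of the proof can be sanity-checked against a truncated series. This is only an illustrative sketch: the function names are hypothetical, and the comparison is meaningful only when $(1-p)e^t < 1$, which is what "for $t$ small enough" amounts to.

```python
import math

def mgf_series(p, t, terms=2000):
    """Truncated series  sum_{m>=1} e^{m t} (1-p)^{m-1} p  for E[e^{tX}]."""
    return sum(math.exp(m * t) * (1 - p) ** (m - 1) * p
               for m in range(1, terms + 1))

def mgf_closed(p, t):
    """Closed form  p e^t / (1 - (1-p) e^t), valid when (1-p) e^t < 1."""
    return p * math.exp(t) / (1 - (1 - p) * math.exp(t))

p, t = 0.3, 0.1
print(mgf_series(p, t), mgf_closed(p, t))
```

Multiplying either value by $e^{-t/p - t\Delta}$ gives the expectation that appears in the proof.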



We can prove the same type of inequality as the one in the lemma above for the variable $-X$. Hence, we assume that there exist $c > 0$ and $\kappa > 0$ such that for all $p \in [1-q, 1]$ condition (4.3) is satisfied, as well as

\[ E\!\left[e^{(-X + (1/p) - \Delta)\cdot\kappa\Delta}\right] \le e^{-0.5\Delta^2\kappa}, \tag{4.11} \]

where again $X$ is a geometric variable with parameter $p$.
Lemma 4.3 We have that $P(B_1^n) \to 1$ as $n \to \infty$.

Proof. Let $B_{11s}$ be the event

\[ B_{11s} := \{Y_1^* + Y_2^* + \dots + Y_s^* \le \ln s \sqrt{s}\} \]

and let

\[ B_{12s} := \{-Y_1^* - Y_2^* - \dots - Y_s^* \le \ln s \sqrt{s}\}. \]

We have that

\[ B_1^n = \bigcap_{s=n^{1/4}}^{nq} (B_{11s} \cap B_{12s}) \]

and hence

\[ P(B_1^{nc}) \le \sum_{s=n^{1/4}}^{nq} P(B_{11s}^{c}) + \sum_{s=n^{1/4}}^{nq} P(B_{12s}^{c}). \tag{4.12} \]

Recall that for every $t > 0$ and any variable $Z$ we have

\[ P(Z \ge 0) \le E[e^{tZ}]. \tag{4.13} \]

We have

\[ P(B_{11s}^{c}) = P\big((Y_1^* - \Delta) + (Y_2^* - \Delta) + \dots + (Y_s^* - \Delta) > 0\big) \]

where $\Delta := \ln s / \sqrt{s}$. Using inequality (4.13) yields

\[ P(B_{11s}^{c}) \le E\!\left[e^{t((Y_1^* - \Delta) + (Y_2^* - \Delta) + \dots + (Y_s^* - \Delta))}\right] = \prod_{i=1}^{s} E\!\left[e^{t(Y_i^* - \Delta)}\right]. \tag{4.14} \]

But $Y_i^* = -X_i + 1/p_i$, where $X_i$ is a geometric variable with parameter $p_i$, since by definition $Y_i^* = Y_i - E(Y_i)$ and $Y_i = 2 - X_i$. Therefore taking $t = \kappa\Delta$ and applying inequality (4.11) to (4.14), we obtain

\[ P(B_{11s}^{c}) \le e^{-0.5\kappa\Delta^2 s} = s^{-0.5\kappa \ln s}. \tag{4.15} \]

Similarly one can prove

\[ P(B_{12s}^{c}) \le e^{-0.5\kappa\Delta^2 s} = s^{-0.5\kappa \ln s}. \tag{4.16} \]

Applying inequalities (4.15) and (4.16) to inequality (4.12) finally gives

\[ P(B_1^{nc}) \le \sum_{s=n^{1/4}}^{nq} 2 s^{-0.5\kappa \ln s} \]

and hence $P(B_1^{nc})$ goes to zero as $n \to \infty$. ■

To prove that the event $B_2^n$ has high probability we first need the following lemma:
Lemma 4.4 For all $i$ such that $0 < i \le (\ln n)^2\sqrt{n}$ we have that

\[ |\mu_{qn+1} + \mu_{qn+2} + \dots + \mu_{qn+i} - i\mu_r| \le (\ln n)^5 \tag{4.17} \]

and

\[ |\mu_{qn-1} + \mu_{qn-2} + \dots + \mu_{qn-i} - i\mu_r| \le (\ln n)^5. \tag{4.18} \]

Proof. Let $f$ be the map defined by $f(x) := 2 - (1/(1-x))$. Note that $f$ is continuously differentiable in a neighborhood of $x = q$. Hence there exists $\delta > 0$ such that for all $\Delta \in [-\delta, \delta]$, we have

\[ |f(q + \Delta) - f(q)| \le c \cdot |\Delta| \tag{4.19} \]

where $c > 0$ is a constant not depending on $\Delta$. Note that when $i$ satisfies $0 < i \le (\ln n)^2\sqrt{n}$ then

\[ \frac{i}{n} \le \frac{(\ln n)^2}{\sqrt{n}}. \]

The right side of the last inequality above goes to zero as $n \to \infty$ and hence for $n$ large enough it is less than $\delta$. We assume now that $n$ is large enough so that

\[ \frac{(\ln n)^2}{\sqrt{n}} \le \delta, \]

from which by (4.19) we get

\[ \left| f\!\left(q + \frac{i}{n}\right) - f(q) \right| \le c\,\frac{i}{n} \le c\,\frac{(\ln n)^2}{\sqrt{n}} \]

and equivalently

\[ |\mu_{qn+i} - \mu_r| \le c\,\frac{(\ln n)^2}{\sqrt{n}}. \]

Applying the last inequality above to the expression on the left side of inequality (4.17) gives

\[ |\mu_{qn+1} + \mu_{qn+2} + \dots + \mu_{qn+i} - i\mu_r| \le i \cdot c\,\frac{(\ln n)^2}{\sqrt{n}} \le c(\ln n)^4. \]

The term on the right side of the last inequality above is, for $n$ large enough, less than $(\ln n)^5$, which finishes proving (4.17). In a similar way we prove (4.18). ■



Lemma 4.5 We have that $P(B_2^n) \to 1$ as $n \to \infty$.

Proof. Hint: Use lemma 4.4 and the Hoeffding inequality. ■



Proof of the main theorem 1.1. Lemma 3.1 states that when $B_0^n$, $B_1^n$ and $B_2^n$ all hold then

\[ \frac{1}{\sigma\sqrt{n}} \left| \tau - \left( qn - \frac{Y_1^* + Y_2^* + \dots + Y_{qn}^*}{\mu_r} \right) \right| \le \frac{2(\ln n)^3}{\sigma\, n^{1/4}}. \tag{4.20} \]

From lemma 2.3 and equality (2.9) it follows that the standard deviation of the sum

\[ Y_1^* + Y_2^* + \dots + Y_{qn}^* \]

is equal, up to a constant term, to

\[ \sqrt{\int_0^q \frac{x}{(1-x)^2}\,dx}\ \sqrt{n}. \]

Hence by the Central Limit Theorem for independent but non-identical variables we have that the re-scaled sum

\[ \frac{Y_1^* + Y_2^* + \dots + Y_{qn}^*}{\sqrt{\int_0^q \frac{x}{(1-x)^2}\,dx}\ \sqrt{n}} \]

converges weakly to a standard normal variable. From this and from the fact that inequality (4.20) holds with probability converging to one when $n$ goes to infinity, we get that $(\tau - qn)/(\sigma\sqrt{n})$ converges weakly to a standard normal. We also used the fact that the right side of (4.20) goes to zero as $n$ goes to infinity. Inequality (4.20) holds with probability going to one when $n$ goes to infinity because the events $B_0^n$, $B_1^n$ and $B_2^n$, which together imply (4.20), all have their probabilities going to one as $n$ goes to infinity.